Affine Invariant Covariance Estimation for Heavy-Tailed Distributions
In this work we provide an estimator for the covariance matrix of a
heavy-tailed multivariate distribution. We prove that the proposed estimator
admits an \textit{affine-invariant} bound, stated in the positive semidefinite
order on symmetric matrices relative to the unknown covariance matrix, that
holds with high probability. The result only requires the existence of
fourth-order moments, with the admissible sample size governed by a measure of
kurtosis of the distribution, the dimensionality of the space, and the desired
confidence level. More generally, we can allow for regularization, in which
case the dimensionality gets replaced with the degrees-of-freedom number. The
computational cost of the novel estimator, which depends on the condition
number of the covariance matrix, is comparable to the cost of the sample
covariance estimator in the statistically interesting regime.
We consider applications of our estimator to eigenvalue estimation with
relative error, and to ridge regression with heavy-tailed random design.
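To make the \textit{affine-invariant} error notion concrete, here is a small numerical sketch. It is illustrative only: it uses the plain sample covariance rather than the estimator proposed above, and all parameter choices are hypothetical. We draw heavy-tailed multivariate Student-$t$ samples and measure the relative error $\|\Sigma^{-1/2}\widehat{\Sigma}\Sigma^{-1/2} - I\|$, a quantity unchanged under invertible affine reparametrizations of the data.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, dof = 5, 2000, 4.5  # dof > 4, so fourth-order moments exist

# Ground-truth covariance with non-trivial conditioning.
A = rng.standard_normal((d, d))
Sigma = A @ A.T + 0.1 * np.eye(d)

# Heavy-tailed sample: multivariate Student-t, rescaled so that its
# covariance matrix is exactly Sigma (using E[1/chi2_k] = 1/(k - 2)).
L = np.linalg.cholesky(Sigma)
g = rng.standard_normal((n, d)) @ L.T
chi = rng.chisquare(dof, size=n)
X = g * np.sqrt((dof - 2) / chi)[:, None]

Sigma_hat = X.T @ X / n  # plain sample covariance

# Affine-invariant relative error: operator norm of
# Sigma^{-1/2} Sigma_hat Sigma^{-1/2} - I, computed via the Cholesky
# factor (orthogonally equivalent, hence the same norm).
Linv = np.linalg.inv(L)
err = np.linalg.norm(Linv @ Sigma_hat @ Linv.T - np.eye(d), ord=2)
print(f"affine-invariant relative error: {err:.3f}")
```

The printed error is invariant under any change of basis `X -> X @ B` with invertible `B`, which is exactly the sense in which the bounds above are affine-invariant.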
Finite-sample Analysis of M-estimators using Self-concordance
We demonstrate how self-concordance of the loss can be exploited to obtain
asymptotically optimal rates for M-estimators in finite-sample regimes. We
consider two classes of losses: (i) canonically self-concordant losses in the
sense of Nesterov and Nemirovski (1994), i.e., with the third derivative
bounded by a power of the second; (ii) pseudo self-concordant losses,
for which the power is removed, as introduced by Bach (2010). These classes
contain some losses arising in generalized linear models, including logistic
regression; in addition, the second class includes some common pseudo-Huber
losses. Our results consist in establishing the critical sample size sufficient
to reach the asymptotically optimal excess risk for both classes of losses.
In terms of the parameter dimension and the effective dimension, which takes
into account possible model misspecification, we characterize the critical
sample size for canonically self-concordant losses and, up to a
problem-dependent local curvature parameter, for pseudo self-concordant
losses. In contrast to the existing results, we only impose local
assumptions on the data distribution, assuming that the calibrated design,
i.e., the design scaled with the square root of the second derivative of the
loss, is subgaussian at the best predictor. Moreover, we obtain
improved bounds on the critical sample size, scaling near-linearly in
dimension, under the extra assumption that the calibrated design
is subgaussian in the Dikin ellipsoid of the best predictor. Motivated by these
findings, we construct canonically self-concordant analogues of the Huber and
logistic losses with improved statistical properties. Finally, we extend some
of these results to regularized M-estimators in high dimensions.
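For intuition, the pseudo self-concordance property of class (ii) can be checked numerically for the logistic loss $\ell(t) = \log(1 + e^{-t})$: its derivatives satisfy $|\ell'''(t)| \le \ell''(t)$ pointwise, whereas the canonical definition instead bounds $|\ell'''|$ by a power of $\ell''$. A minimal sketch (illustrative, not taken from the paper):

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

# Logistic loss l(t) = log(1 + exp(-t)) and its derivatives:
# l''(t) = s(1 - s) and l'''(t) = s(1 - s)(1 - 2s), with s = sigmoid(t).
t = np.linspace(-30, 30, 10001)
s = sigmoid(t)
d2 = s * (1 - s)
d3 = s * (1 - s) * (1 - 2 * s)

# Pseudo self-concordance (Bach, 2010): |l'''| <= l'' pointwise,
# which holds because |1 - 2s| <= 1 for s in [0, 1].
print(bool(np.all(np.abs(d3) <= d2 + 1e-12)))  # True
```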
Finite-sample analysis of M-estimators using self-concordance
The classical asymptotic theory for parametric M-estimators guarantees that, in the limit of infinite sample size, the excess risk has a chi-square type distribution, even in the misspecified case. We demonstrate how self-concordance of the loss allows us to characterize the critical sample size sufficient to guarantee a chi-square type in-probability bound for the excess risk. Specifically, we consider two classes of losses: (i) self-concordant losses in the classical sense of Nesterov and Nemirovski, i.e., whose third derivative is uniformly bounded by a power of the second derivative; (ii) pseudo self-concordant losses, for which the power is removed. These classes contain losses corresponding to several generalized linear models, including the logistic loss and pseudo-Huber losses. Our basic result, under minimal assumptions, bounds the critical sample size in terms of the parameter dimension and the effective dimension that accounts for model misspecification. In contrast to the existing results, we only impose local assumptions that concern the population risk minimizer. Namely, we assume that the calibrated design, i.e., the design scaled by the square root of the second derivative of the loss, is subgaussian at this minimizer. Besides, for type-ii losses we require boundedness of a certain measure of curvature of the population risk at the minimizer. Our improved result bounds the critical sample size from above under slightly stronger assumptions. Namely, the local assumptions must hold in a neighborhood of the minimizer given by the Dikin ellipsoid of the population risk. Interestingly, we find that, for logistic regression with Gaussian design, there is no actual restriction of conditions: the subgaussian parameter and the curvature measure remain near-constant over the Dikin ellipsoid. Finally, we extend some of these results to penalized estimators in high dimensions.
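Similarly, the pseudo-Huber loss $h(t) = \sqrt{1 + t^2} - 1$ mentioned above belongs to the pseudo self-concordant class but not the classical one: the ratio $|h'''|/h''$ stays bounded (by $3/2$), while $|h'''|/(h'')^{3/2}$ grows without bound. A quick numerical check (illustrative only, not from the paper):

```python
import numpy as np

# Pseudo-Huber loss h(t) = sqrt(1 + t^2) - 1 and its derivatives:
# h''(t) = (1 + t^2)^(-3/2),  h'''(t) = -3t (1 + t^2)^(-5/2).
t = np.linspace(-100.0, 100.0, 200001)
d2 = (1.0 + t**2) ** -1.5
d3 = -3.0 * t * (1.0 + t**2) ** -2.5

# Bounded ratio |h'''| / h'' = 3|t| / (1 + t^2): pseudo self-concordance,
# with the maximum value 3/2 attained at |t| = 1.
print(np.max(np.abs(d3) / d2))

# Unbounded ratio |h'''| / (h'')^{3/2} ~ 3 sqrt(|t|): the classical
# power-of-the-second-derivative bound fails for any fixed constant.
print(np.max(np.abs(d3) / d2**1.5))
```

The first ratio saturating at a constant while the second grows with the grid range is exactly the distinction between classes (ii) and (i).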
Adaptive Denoising of Signals with Shift-Invariant Structure
We study the problem of discrete-time signal denoising, following the line of
research initiated by [Nem91] and further developed in [JN09, JN10, HJNO15,
OHJN16]. Previous papers considered the following setup: the signal is assumed
to admit a convolution-type linear oracle -- an unknown linear estimator in the
form of the convolution of the observations with an unknown time-invariant
filter with small norm. It was shown that such an oracle can be
"mimicked" by an efficiently computable non-linear convolution-type estimator,
in which the filter minimizes a Fourier-domain norm of the
residual, regularized by the Fourier-domain norm of the filter.
Following [OHJN16], here we study an alternative family of estimators,
replacing the norm of the residual with a different one. Such
estimators are found to have better statistical properties; in particular, we
prove sharp oracle inequalities for their loss. Our guarantees require
an extra assumption of approximate shift-invariance: the signal must be
close to some shift-invariant linear subspace
with bounded dimension. However, this subspace can be completely unknown,
and the remainder terms in the oracle inequalities scale at most polynomially
with the subspace dimension and the distance to it. In conclusion, we show that
the new assumption implies the previously considered one, providing explicit
constructions of the convolution-type linear oracles with norm bounded in
terms of these parameters.
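To illustrate the Fourier-domain viewpoint behind convolution-type estimators, here is a simplified sketch. It is not the estimator analyzed above: it uses a closed-form quadratically regularized fit in place of the papers' regularization, and all parameter choices are hypothetical. The key fact it exercises is that a circular convolution $\varphi * y$ becomes a pointwise product in the Fourier domain, so a quadratic residual-plus-penalty objective decouples coordinate-wise into a shrinkage rule.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 256
t = np.arange(n)

# A signal lying in a low-dimensional shift-invariant subspace
# (a span of a few sinusoids), observed with white Gaussian noise.
x = np.sin(2 * np.pi * 5 * t / n) + 0.5 * np.cos(2 * np.pi * 13 * t / n)
y = x + 0.3 * rng.standard_normal(n)

# In the Fourier domain, the circular convolution phi * y is the pointwise
# product Phi * Y.  Minimizing |Y - Phi*Y|^2 + lam*n*|Phi|^2 separately in
# each coordinate gives a closed-form, Wiener-like shrinkage of Y.
lam = 2.0
Y = np.fft.fft(y)
Phi = np.abs(Y) ** 2 / (np.abs(Y) ** 2 + lam * n)
x_hat = np.real(np.fft.ifft(Phi * Y))

print(f"noisy MSE:    {np.mean((y - x) ** 2):.4f}")
print(f"denoised MSE: {np.mean((x_hat - x) ** 2):.4f}")
```

Frequencies carrying signal energy have $|Y_k|^2$ of order $n^2$ and pass nearly intact, while pure-noise frequencies have $|Y_k|^2$ of order $n$ and are shrunk toward zero; the estimators in the papers achieve adaptivity with a different, sparsity-promoting regularization of the filter rather than this quadratic one.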